Segmentation of touching and fused Devanagari characters
Abstract
Devanagari script is a two-dimensional composition of symbols. Treating each composite character as a separate atomic symbol is cumbersome because the number of such combinations is very large. This paper presents a two-pass algorithm for segmenting and decomposing Devanagari composite characters/symbols into their constituent symbols. The proposed algorithm makes extensive use of the structural properties of the script. In the first pass, words are segmented into easily separable characters or composite characters. Statistical information about the height and width of each separated box is then used to hypothesize whether a character box is composite. In the second pass, the hypothesized composite characters are segmented further. A recognition rate of 85 percent has been achieved on the segmented conjuncts. The algorithm is designed to segment a pair of touching characters.
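The first pass and the composite-box hypothesis described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes a binary word image whose header line (shirorekha) has already been removed, cuts only at empty pixel columns, and uses box width alone, whereas the abstract mentions both height and width. The function names, the use of the median, and the `ratio=1.5` threshold are all hypothetical choices made for illustration.

```python
from statistics import median


def split_columns(image):
    """Simplified first pass: cut a binary word image at empty columns.

    `image` is a list of rows of 0/1 pixels (assumed: header line removed).
    Returns a list of (start, end) column spans, with `end` exclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # A column "has ink" if any row contains a foreground pixel there.
    ink = [any(row[x] for row in image) for x in range(width)]
    spans, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a new box begins
        elif not has_ink and start is not None:
            spans.append((start, x))       # the box ends at an empty column
            start = None
    if start is not None:
        spans.append((start, width))       # box touching the right edge
    return spans


def flag_composites(spans, ratio=1.5):
    """Hypothesize composites: boxes much wider than the median box width.

    `ratio` is an assumed threshold, not a value taken from the paper.
    """
    if not spans:
        return []
    typical = median(end - start for start, end in spans)
    return [(end - start) > ratio * typical for start, end in spans]


# Example: four boxes, the third is noticeably wider than the others.
word = [[1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1]]
boxes = split_columns(word)
suspects = flag_composites(boxes)
```

Boxes flagged here would be the candidates handed to a second pass for further segmentation; how that second pass locates the cut inside a touching pair is not specified in the abstract and is not sketched.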
Similar resources
Segmentation of Touching Hand written Telugu Characters by using Drop Fall Algorithm
Recognition of Indian language scripts is a challenging problem. Work on the development of complete OCR systems for Indian language scripts is still in its infancy. Complete OCR systems have recently been developed for Devanagari and Bangla scripts. Research in the field of recognition of Telugu script faces major problems mainly related to the touching and overlapping of characters. Segmentation ...
HMM-based Indic handwritten word recognition using zone segmentation
This paper presents a novel approach towards Indic handwritten word recognition using zone-wise information. Because of complex nature due to compound characters, modifiers, overlapping and touching, etc., character segmentation and recognition is a tedious job in Indic scripts (e.g. Devanagari, Bangla, Gurumukhi, and other similar scripts). To avoid character segmentation in such scripts, HMMb...
A Novel Approach of Segmenting Touching and Kerned Characters
Character segmentation is a critical step in an OCR system. In this paper, we discuss approaches to segmenting touching and kerned characters. A non-linear, segmentation-path-based algorithm for segmenting touching and kerned characters is put forward. First, touching and kerned characters are extracted and separated from other characters by using character projections and recognition results. The...
Multi-oriented touching text character segmentation in graphical documents using dynamic programming
The touching character segmentation problem becomes complex when touching strings are multi-oriented. Moreover, in graphical documents, characters in a single touching string sometimes have different orientations. Segmentation of such complex touching strings is more challenging. In this paper, we present a scheme towards the segmentation of English multi-oriented touching strings into individual characte...
Segmentation of Isolated and Touching Characters in Offline Handwritten Gurmukhi Script Recognition
Segmentation of a word into characters is one of the important challenges in optical character recognition. This is even more challenging when we segment characters in an offline handwritten document. Touching characters make this problem more complex. In this paper, we have applied water reservoir based technique for identification and segmentation of touching characters in handwritten Gurmukh...
Journal title:
- Pattern Recognition
Volume 35, Issue
Pages -
Publication date: 2002